The loss surfaces of neural networks with general activation functions
Abstract
The loss surfaces of deep neural networks have been the subject of several studies, theoretical and experimental, over the last few years. One strand of work considers the complexity, in the sense of the number of local optima, of high dimensional random functions, with the aim of informing how local optimisation methods may perform in such complicated settings. Prior work of Choromanska et al (2015) established a direct link between the training loss surfaces of deep multi-layer perceptron networks and spherical multi-spin glass models under some very strong assumptions on the network and its data. In this work, we test the validity of this approach by removing the undesirable restriction to ReLU activation functions. In doing so, we chart a new path through the spin glass complexity calculations using supersymmetric methods in Random Matrix Theory which may prove useful in other contexts. Our results shed new light on both the strengths and the weaknesses of spin glass models in this context.
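For orientation, the spherical multi-spin glass underlying the Choromanska et al correspondence can be written in its standard form; the notation below follows the usual spin glass literature and is supplied as background rather than quoted from the paper.

```latex
% Standard spherical p-spin glass Hamiltonian (background notation,
% not quoted from the paper): the couplings X_{i_1...i_p} are i.i.d.
% standard Gaussians and the configuration w lies on the sphere of
% radius sqrt(N).
\[
  H_{N,p}(w) \;=\; \frac{1}{N^{(p-1)/2}}
  \sum_{i_1,\dots,i_p=1}^{N} X_{i_1\dots i_p}\, w_{i_1}\cdots w_{i_p},
  \qquad \lVert w \rVert_2^2 = N .
\]
```

Under the strong assumptions referenced above, the training loss of a deep ReLU network reduces to a Hamiltonian of this form, whose complexity (the expected number of local optima) can be computed exactly; relaxing the ReLU restriction is what motivates the supersymmetric Random Matrix Theory route mentioned in the abstract.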
Similar Resources
Stochastic Neural Networks with Monotonic Activation Functions
We propose a Laplace approximation that creates a stochastic unit from any smooth monotonic activation function, using only Gaussian noise. This paper investigates the application of this stochastic approximation in training a family of Restricted Boltzmann Machines (RBM) that are closely linked to Bregman divergences. This family, that we call exponential family RBM (Exp-RBM), is a subset of t...
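As a sketch of the construction described above, the following draws a stochastic unit from a smooth monotonic activation using only Gaussian noise; taking f'(eta) as the noise variance is an assumption following the usual Laplace-approximation recipe, not a detail recoverable from the truncated snippet.

```python
import numpy as np

def stochastic_unit(eta, f, f_prime, rng=None):
    """Sample a stochastic activation built from a smooth monotonic f.

    Draws from a Gaussian with mean f(eta); using variance f'(eta) is
    an assumption based on the usual Laplace-approximation construction,
    not a detail taken from the (truncated) abstract above.
    """
    rng = rng or np.random.default_rng()
    mean = f(eta)
    var = np.maximum(f_prime(eta), 1e-12)  # guard against zero variance
    return rng.normal(mean, np.sqrt(var))

# Example: a stochastic softplus unit (d/dx softplus = sigmoid).
softplus = lambda x: np.log1p(np.exp(x))
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
print(stochastic_unit(np.array([0.5, -1.0]), softplus, sigmoid))
```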
Deep Neural Networks with Multistate Activation Functions
We propose multistate activation functions (MSAFs) for deep neural networks (DNNs). These MSAFs are new kinds of activation functions which are capable of representing more than two states, including the N-order MSAFs and the symmetrical MSAF. DNNs with these MSAFs can be trained via conventional Stochastic Gradient Descent (SGD) as well as mean-normalised SGD. We also discuss how these MSAFs p...
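The snippet does not spell out the functional form, but one natural way to obtain more than two output states is a sum of shifted sigmoids, sketched below; the order and spacing are illustrative choices, not the paper's exact parametrisation.

```python
import numpy as np

def msaf(x, n=3, spacing=2.0):
    """Illustrative multistate activation: a sum of n shifted sigmoids.

    Each sigmoid contributes one transition, so the output moves through
    roughly n + 1 plateaus ("states"). The order n and shift spacing are
    illustrative choices, not the paper's exact definition.
    """
    shifts = spacing * (np.arange(n) - (n - 1) / 2.0)
    return sum(1.0 / (1.0 + np.exp(-(x - s))) for s in shifts)

x = np.linspace(-6.0, 6.0, 7)
print(msaf(x))  # steps through values near 0, 1, 2 and 3
```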
Recurrent neural networks with trainable amplitude of activation functions
An adaptive amplitude real time recurrent learning (AARTRL) algorithm for fully connected recurrent neural networks (RNNs) employed as nonlinear adaptive filters is proposed. Such an algorithm is beneficial when dealing with signals that have rich and unknown dynamical characteristics. Following the approach from prior work, three different cases for the algorithm are considered: a common adaptive amplitu...
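A minimal illustration of the trainable-amplitude idea, assuming the common Phi(x) = lambda * tanh(x) parametrisation and a generic gradient update rather than the full AARTRL recursion:

```python
import numpy as np

class AdaptiveAmplitudeTanh:
    """tanh nonlinearity with a trainable amplitude: Phi(x) = lam * tanh(x).

    The amplitude is adapted by gradient descent on the squared error,
    mirroring the adaptive-amplitude idea described above; the update
    rule is a generic sketch, not the AARTRL algorithm itself.
    """

    def __init__(self, lam=1.0, lr=0.01):
        self.lam, self.lr = lam, lr

    def forward(self, x):
        self._t = np.tanh(x)  # cached for the amplitude update
        return self.lam * self._t

    def adapt(self, error):
        # With error = target - output and E = 0.5 * error**2,
        # dE/dlam = -error * tanh(x), so descend via lam += lr * error * tanh(x).
        self.lam += self.lr * np.mean(error * self._t)

unit = AdaptiveAmplitudeTanh()
y = unit.forward(np.array([0.3, -1.2]))
unit.adapt(np.array([0.5, -0.9]) - y)  # adapt amplitude toward the targets
```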
Complex-valued Neural Networks with Non-parametric Activation Functions
Complex-valued neural networks (CVNNs) are a powerful modeling tool for domains where data can be naturally interpreted in terms of complex numbers. However, several analytical properties of the complex domain (e.g., holomorphicity) make the design of CVNNs a more challenging task than their real counterpart. In this paper, we consider the problem of flexible activation functions (AFs) in the c...
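For context, a standard split-complex activation illustrates the holomorphicity obstacle mentioned above; this is a common baseline device, not the non-parametric construction the paper itself proposes.

```python
import numpy as np

def split_tanh(z):
    """Split-complex activation: tanh applied separately to Re and Im.

    By Liouville's theorem a bounded, non-constant function cannot be
    holomorphic on all of C, so CVNNs commonly trade holomorphicity for
    component-wise nonlinearities like this one. This is a standard
    baseline, not the flexible non-parametric AF the paper proposes.
    """
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

print(split_tanh(np.array([1.0 + 2.0j, -0.5 - 0.5j])))
```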
Neural Networks with Smooth Adaptive Activation Functions for Regression
In Neural Networks (NN), Adaptive Activation Functions (AAF) have parameters that control the shapes of activation functions. These parameters are trained along with other parameters in the NN. AAFs have improved performance of Neural Networks (NN) in multiple classification tasks. In this paper, we propose and apply AAFs on feedforward NNs for regression tasks. We argue that applying AAFs in t...
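A minimal sketch of an AAF with two shape parameters: a scaled tanh whose amplitude and slope receive gradients like any other weight. The two-parameter form is an illustrative assumption, not the paper's exact parametrisation.

```python
import numpy as np

def aaf_forward(x, a, b):
    """Smooth adaptive activation f(x) = a * tanh(b * x).

    a controls the amplitude and b the slope; both are trained alongside
    the ordinary network weights. This two-parameter form is an
    illustrative assumption, not the paper's parametrisation.
    """
    return a * np.tanh(b * x)

def aaf_param_grads(x, a, b, upstream):
    """Gradients of the shape parameters for a backward pass."""
    t = np.tanh(b * x)
    grad_a = np.sum(upstream * t)                   # df/da = tanh(b x)
    grad_b = np.sum(upstream * a * (1 - t**2) * x)  # df/db = a x sech^2(b x)
    return grad_a, grad_b
```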
Journal
Journal title: Journal of Statistical Mechanics: Theory and Experiment
Year: 2021
ISSN: 1742-5468
DOI: https://doi.org/10.1088/1742-5468/abfa1e